11 research outputs found

A Survey of Entity Resolution and Record Linkage Methodologies

Author: Brizan David Guy
Tansel Abdullah Uz
Publication venue: CSUSB ScholarWorks
Publication date: 07/01/2015

A great deal of research is focused on the formation of a data warehouse. This is an important area of research as it could save many computation cycles and thus allow accurate information to be provided to the right people at the right time. Two considerations when forming a data warehouse are data cleansing (including entity resolution) and schema integration (including record linkage). Uncleansed and fragmented data requires time to decipher and may lead to increased costs for an organization, so data cleansing and schema integration can save a great many (human) computation cycles and can lead to higher organizational efficiency. In this study we survey the literature for the methodologies proposed or developed for entity resolution and record linkage. This survey provides a foundation for solving many problems in data warehousing. For instance, little or no research has been directed at the problem of maintenance of cleansed and linked relations.

CSUSB ScholarWorks

Culture Clubs: Processing Speech by Deriving and Exploiting Linguistic Subcultures

Author: Brizan David Guy
Publication venue: CUNY Academic Works
Publication date: 01/02/2019

Spoken language understanding systems are error-prone for several reasons, including individual speech variability. This is manifested in many ways, among which are differences in pronunciation, lexical inventory, grammar and disfluencies. There is, however, a lot of evidence pointing to stable language usage within subgroups of a language population. We call these subgroups linguistic subcultures. The two broad problems are defined and a survey of the work in this space is performed. The two broad problems are: linguistic subculture detection, commonly performed via Language Identification, Accent Identification or Dialect Identification approaches; and speech and language processing tasks which may see increases in performance by modeling for each linguistic subculture. The data used in the experiments are drawn from four corpora: Accents of the British Isles (ABI), Intonational Variation in English (IViE), the NIST Language Recognition Evaluation Plan (LRE15) and Switchboard. The speakers in the corpora come from different parts of the United Kingdom and the United States and were provided different stimuli. From the speech samples, two feature sets are used in the experiments. A number of experiments to determine linguistic subcultures are conducted. The set of experiments covers a number of approaches, including the use of traditional machine learning approaches shown to be effective for similar tasks in the past, each with multiple feature sets. State-of-the-art deep learning approaches are also applied to this problem. Two large automatic speech recognition (ASR) experiments are performed against all three corpora: one monolithic experiment for all the speakers in each corpus and another for the speakers in groups according to their identified linguistic subcultures. For the discourse markers labeled in the Switchboard corpus, there are some interesting trends when examined through the lens of the speakers in their linguistic subcultures.
Two large dialogue acts experiments are performed against the labeled portion of the Switchboard corpus: one monocultural (or monolithic) experiment for all the speakers in each corpus and another for the speakers in groups according to their identified linguistic subcultures. We conclude by discussing applications of this work, the changing landscape of natural language processing and suggestions for future research.

City University of New York

Deep Neural Network Architectures For Music Genre Classification

Author: Brizan David Guy
Middlebrook Kai
Sonar Kunal
Sudhakaran Shyam
Publication venue: USF Scholarship: a digital repository @ Gleeson Library | Geschke Center
Publication date: 26/04/2019

With the recent advancements in technology, many tasks in fields such as computer vision, natural language processing, and signal processing have been solved using deep learning architectures. In the audio domain, these architectures have been used to learn musical features of songs to predict moods, genres, and instruments. In the case of genre classification, deep learning models were applied to popular datasets, which are explicitly chosen to represent their genres, and achieved state-of-the-art results. However, these results have not been reproduced on less refined datasets. To this end, we introduce an uncurated dataset which contains genre labels and 30-second audio previews for approximately fifteen thousand songs from Spotify. In our work, we focus on solving automatic genre classification using deep learning and crude data. Specifically, we propose deep architectures that learn hierarchical characteristics of music using raw waveform audio rather than preprocessed audio in the form of mel-spectrograms and apply these models to the Spotify dataset. Our experiments show how deep learning architectures using unpolished data can achieve comparable results to previous state-of-the-art music classifiers using filtered data.

University of San Francisco

Quantum Criticism

Author: Badgujar Ashwini
Brizan David Guy
Intrevado Paul
Wang Andrew
Yu Kai
Publication venue: USF Scholarship: a digital repository @ Gleeson Library | Geschke Center
Publication date: 26/04/2019

Quantum Criticism scrapes data from news articles and performs sentiment analysis.

University of San Francisco

Psychocomputational Models of Human Language Acquisition

Author: Brizan David Guy
Sakas William Gregory
Publication venue: eScholarship, University of California
Publication date: 01/01/2007

eScholarship - University of California

Preface

Author: David Guy Brizan
William Gregory Sakas

In conjunction with th

CiteSeerX